Interpretable Privacy Preservation of Text Representations Using Vector Steganography

Abstract

Contextual word representations generated by language models learn spurious associations present in the training corpora. Adversaries can exploit these associations to reverse-engineer the private attributes of entities mentioned in the training corpora. These findings have led to efforts towards minimizing the privacy risks of language models. However, existing approaches lack interpretability, compromise on data utility, and fail to provide privacy guarantees. Thus, the goal of my doctoral research is to develop interpretable approaches towards privacy preservation of text representations that maximize data utility retention and guarantee privacy. To this end, I aim to study and develop methods that incorporate steganographic modifications within the vector geometry to obfuscate the underlying spurious associations and retain the distributional semantic properties learnt during training.

Related resources

High capacity steganography tool for Arabic text using 'Kashida'

Steganography is the ability to hide secret information in a cover-media such as sound, pictures and text. A new approach is proposed to hide a secret into Arabic text cover media using "Kashida", an Arabic extension character. The proposed approach is an attempt to maximize the use of "Kashida" to hide more information in Arabic text cover-media. To approach this, some algorithms have been des...

Text comparison using word vector representations and dimensionality reduction

This paper describes a technique to compare large text sources using word vector representations (word2vec) and dimensionality reduction (tSNE) and how it can be implemented using Python. The technique provides a bird’s-eye view of text sources, e.g. text summaries and their source material, and enables users to explore text sources like a geographical map. Word vector representations capture m...

Combining Text Vector Representations for Information Retrieval

This paper suggests a novel representation for documents that is intended to improve precision. This representation is generated by combining two central techniques: Random Indexing; and Holographic Reduced Representations (HRRs). Random indexing uses co-occurrence information among words to generate semantic context vectors that are the sum of randomly generated term identity vectors. HRRs are...

Interpretable support vector regression

This paper deals with transforming Support vector regression (SVR) models into fuzzy systems (FIS). It is highlighted that trained support vector based models can be used for the construction of fuzzy rule-based regression models. However, the transformed support vector model does not automatically result in an interpretable fuzzy model. Training of a support vector model results a complex rule...

Privacy Preservation Using Multi-Context Systems

Preserving the privacy of sensitive data is one of the major challenges which the information society has to face. Traditional approaches focused on the infrastructure for identifying data which is to be kept private and for managing access rights to these data. However, while these efforts are useful, they do not address an important aspect: While the sensitive data itself can be protected nic...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i11.21573